Centralize Turbo export/push config, make quantization defaults deterministic, and prepare v2.1 pre-release #25
Agent-Logs-Url: https://github.com/codewithdark-git/QuantLLM/sessions/aa78d528-be1d-4467-813d-711a55ade22a Co-authored-by: codewithdark-git <144595403+codewithdark-git@users.noreply.github.com>
@copilot The changes are correct, but we need to update all the documentation, replace the old code with the new API wherever it is still used, and prepare the pre-release for v2.1.
Pull request overview
This PR centralizes export/push configuration at the Turbo/TurboModel level to avoid per-call duplication and to make default export/push behavior deterministic across export() and push().
Changes:
- Added shared `config` support to `turbo(...)` and `TurboModel.from_pretrained(...)`, resolved once into `TurboModel.export_push_config`.
- Updated `export()` and `push_to_hub()` to consume the resolved internal config when per-call args are omitted.
- Added tests plus documentation/examples updates to reflect configuring export/push defaults once at initialization.
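The resolve-once pattern in the list above can be sketched as follows. This is a minimal illustration that does not import QuantLLM: `SketchTurboModel` and `DEFAULTS` are hypothetical stand-ins, and only the attribute name `export_push_config` and the default values come from this PR.

```python
from typing import Any, Dict, Optional

# Default values as listed in the PR description; the dict name is hypothetical.
DEFAULTS: Dict[str, Any] = {
    "format": "safetensors",
    "push_format": "safetensors",
    "quantization": "Q4_K_M",
    "push_quantization": None,
}


class SketchTurboModel:
    """Hypothetical stand-in for TurboModel, not the library's implementation."""

    def __init__(self, config: Optional[Dict[str, Any]] = None) -> None:
        # Resolve once at construction: user-supplied keys override the defaults.
        self.export_push_config: Dict[str, Any] = {**DEFAULTS, **(config or {})}

    def export(
        self,
        format: Optional[str] = None,
        quantization: Optional[str] = None,
    ) -> Dict[str, Any]:
        # Per-call arguments win; otherwise fall back to the resolved config.
        cfg = self.export_push_config
        return {
            "format": format if format is not None else cfg["format"],
            "quantization": (
                quantization if quantization is not None else cfg["quantization"]
            ),
        }


model = SketchTurboModel(config={"format": "gguf"})
print(model.export())                      # falls back to the shared config
print(model.export(format="safetensors"))  # per-call override
```

Resolving at construction means every later call sees the same settings, which is what makes the defaults deterministic across calls.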
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| quantllm/core/turbo_model.py | Introduces default export/push config, resolver, and uses it to default export()/push_to_hub() behavior. |
| tests/test_export_push_config.py | Adds coverage for deterministic config resolution and propagation into export()/push(). |
| docs/guide/hub-integration.md | Updates hub examples to show one-time config at turbo(...) init. |
| docs/api/turbo.md | Documents the new config parameter on turbo(...). |
| README.md | Updates quickstart snippet to use shared config + argument-free export()/push(). |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@copilot apply changes based on the comments in this thread
Agent-Logs-Url: https://github.com/codewithdark-git/QuantLLM/sessions/4f6f0bc8-c954-444c-afa4-6244c9cf4865 Co-authored-by: codewithdark-git <144595403+codewithdark-git@users.noreply.github.com>
Done in
Applied the review-thread fixes in
Export/push behavior required repeating format/quantization across multiple calls, which could diverge (`export()` vs `push()`) and produce inconsistent outputs. This change introduces a single Turbo-level config for downstream export/push behavior with explicit defaults, format-safe quantization behavior, and v2.1 pre-release documentation/version updates.

- Shared Turbo-level export/push config
  - Added `config` support to `turbo(...)` and `TurboModel.from_pretrained(...)`.
  - Default config in `TurboModel`:
    - `format: "safetensors"`
    - `push_format: "safetensors"`
    - `quantization: "Q4_K_M"`
    - `push_quantization: None`
  - `push_quantization` overrides are respected (explicit `None` is preserved).
- `export()` now consumes internal config
  - `format` is optional; when omitted, uses shared config.
  - Quantization uses the shared default (`Q4_K_M`) unless explicitly overridden.
  - Explicit `format`/`quantization` args still take precedence.
- `push()` now consumes internal config with format-safe quantization
  - `format` is optional; when omitted, uses `push_format`.
  - `push_quantization` is no longer always-on by default.
- Docs + examples updated for new usage
  - Examples call `export()`/`push()` without repeating format/quantization in common GGUF flows.
- Pre-release version update
  - Version set to `v2.1.0rc1` for pre-release preparation.
- Focused coverage for config propagation
  - Tests cover `push_quantization` override behavior and use `tmp_path`.
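To make the precedence rules for `push()` concrete, here is a hedged sketch of how an explicit `None` can be told apart from an omitted argument. `resolve_push` and `_UNSET` are hypothetical names, not part of QuantLLM. The rule that quantization is dropped for non-GGUF formats is an assumption about what "format-safe" means here, based on `Q4_K_M` being a GGUF quantization type.

```python
from typing import Any, Dict, Optional

_UNSET = object()  # sentinel: distinguishes "argument omitted" from an explicit None


def resolve_push(
    cfg: Dict[str, Any],
    format: Any = _UNSET,
    quantization: Any = _UNSET,
) -> Dict[str, Optional[str]]:
    """Hypothetical helper: pick effective push settings from config and call args."""
    fmt = cfg["push_format"] if format is _UNSET else format
    quant = cfg["push_quantization"] if quantization is _UNSET else quantization
    # Assumption: quantization names such as Q4_K_M only apply to GGUF output,
    # so the value is dropped for any other format.
    if fmt != "gguf":
        quant = None
    return {"format": fmt, "quantization": quant}


cfg = {"push_format": "gguf", "push_quantization": "Q4_K_M"}
print(resolve_push(cfg))                        # config values used as-is
print(resolve_push(cfg, quantization=None))     # explicit None is preserved
print(resolve_push(cfg, format="safetensors"))  # quantization dropped for non-GGUF
```

A sentinel is used instead of `None` as the parameter default because `None` is itself a meaningful value for `push_quantization`.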